Metric Lexical Analysis
Authors
Abstract
We study automata-theoretic properties of distances and quasi-distances between words. We show that every additive distance is finite. We also show that every additive quasi-distance is regularity-preserving, that is, the neighborhood of any radius of a regular language with respect to an additive quasi-distance is regular. As an application we present a simple algorithm that constructs a metric (fault-tolerant) lexical analyzer for any given lexical analyzer and desired radius (fault-tolerance index).
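To make the notion of a neighborhood concrete, the following is a minimal sketch and not the paper's algorithm. It assumes Levenshtein edit distance as a stand-in distance (the abstract does not name a specific one) and a hypothetical toy DFA accepting only the keyword "if". The function `within_radius` and the names `DELTA`, `START`, `ACCEPTING`, and `ALPHABET` are illustrative. The function decides whether a word lies within a given radius of the DFA's language by tracking, for each DFA state, the cheapest number of edits needed to reach it.

```python
# Hypothetical toy DFA accepting only the keyword "if" (partial transition table).
DELTA = {(0, "i"): 1, (1, "f"): 2}
START = 0
ACCEPTING = {2}
ALPHABET = {"i", "f", "x"}


def within_radius(delta, start, accepting, alphabet, word, radius):
    """Return True if some word accepted by the DFA is within `radius`
    Levenshtein edits of `word` (assumed distance, see lead-in)."""
    inf = radius + 1  # any cost above the radius counts as unreachable

    def close(cost):
        # Let the language word contain extra symbols (one edit each):
        # relax DFA transitions without consuming input until stable.
        changed = True
        while changed:
            changed = False
            for (q, _a), q2 in delta.items():
                c = cost.get(q, inf) + 1
                if c <= radius and c < cost.get(q2, inf):
                    cost[q2] = c
                    changed = True
        return cost

    # cost[q] = fewest edits that bring the DFA to state q on the input so far.
    cost = close({start: 0})
    for ch in word:
        nxt = {}
        for q, c in cost.items():
            # Skip the input symbol ch (one edit), staying in state q.
            if c + 1 <= radius:
                nxt[q] = min(nxt.get(q, inf), c + 1)
            for a in alphabet:
                q2 = delta.get((q, a))
                if q2 is None:
                    continue
                # Reading a == ch is free; any other symbol is a substitution.
                c2 = c if a == ch else c + 1
                if c2 <= radius:
                    nxt[q2] = min(nxt.get(q2, inf), c2)
        cost = close(nxt)
    return any(q in accepting for q in cost)
```

For instance, with radius 1 this sketch accepts "ix" (one substitution away from "if") and "i" (one missing symbol), but rejects "xxif", which needs two edits. Because the edit budget is bounded by the radius, the pairs (state, edits used) form a finite set, which gives an intuition for why such a neighborhood of a regular language can again be recognized by a finite automaton.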
Similar resources
Measuring Conceptual Distance Using WordNet: The Design of a Metric for Measuring Semantic Similarity*
This paper describes the development of a metric for measuring the semantic distance or similarity of words using the WordNet lexical database. Such a metric could be of use in development of search engines and text retrieval systems, tasks for which the richness of natural language can cause difficulty. Further, such a metric can prove invaluable to psycholinguists who wish to study lexical se...
Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework
Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...
A Comparative Analysis of Lexical Bundles in Journalistic Writing in English and Persian: A Contrastive Linguistic Perspective
This paper investigates the use of ‘lexical bundles’ in two broad corpora of journalistic writing. The aim of this study is to compare the use of lexical bundles in the two domains, one consisting of newspaper articles written in English and published in England and the other comprising newspaper articles written in Persian from Iranian publications. For this purpose, the frequency...
Native and Non-native Use of Lexical Bundles in Discussion Section of Political Science Articles
Among the types of text analysis, the study of lexical bundles has gained particular importance over the last century. The present study employed a frequency-based analysis approach to the use of lexical bundles. The discussion sections of 60 political science articles, a corpus of around 253,063 words, were investigated with respect to three aspects of lexical bundles: structure, form, and function. The p...
PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts
We present PEM, the first fully automatic metric to evaluate the quality of paraphrases, and consequently, that of paraphrase generation systems. Our metric is based on three criteria: adequacy, fluency, and lexical dissimilarity. The key component in our metric is a robust and shallow semantic similarity measure based on pivot language N-grams that allows us to approximate adequacy independent...